There are many cases in quantum mechanics where we need to change the basis of a vector from one basis to another.
For example, in our discussion of spin states, we can use $\ket{{+}z}$ and $\ket{{-}z}$ as the basis vectors, but we can also use $\ket{{+}x}$ and $\ket{{-}x}$ as the basis vectors.
The change of basis is very important in many areas of physics.
For instance, Lorentz transformations in special relativity are, at their core, changes of basis for spacetime vectors.
We now explore some mathematical concepts that will help us understand how to change the basis of a vector.
In this page, we shall employ a color scheme to help us distinguish between different bases:
The old basis is dark red, and the new basis is dark blue.
The old components are pink, and the new components are light blue.
Suppose $\{\ket{e_i}\}$ and $\{\ket{f_i}\}$ are two sets of orthonormal basis vectors.
In order to transform one basis to another, we need to find an operator that can transform the basis vectors.
Denote this operator as $\hat{U}$, such that $\hat{U}\ket{e_i} = \ket{f_i}$.
In order to find out what this operator is, we can use a neat trick using the outer product.
Consider the following: $\ket{f_j}\braket{e_j \vert e_i} = \delta_{ji}\ket{f_j}$.
When $j = i$, this is just $\ket{f_i}$.
When $j \neq i$, this is $0$.
Therefore, we can write $\hat{U}$ as a sum of all of these outer products:

$$\hat{U} = \sum_j \ket{f_j}\bra{e_j}$$
Previously, we have seen that a Hermitian operator $\hat{A}$ satisfies $\hat{A}^\dagger = \hat{A}$.
Let's see what happens when we take the adjoint of $\hat{U}$, remembering the rules of adjoints:

$$\hat{U}^\dagger = \left(\sum_j \ket{f_j}\bra{e_j}\right)^\dagger = \sum_j \ket{e_j}\bra{f_j}$$

Notice what happens when we multiply $\hat{U}^\dagger$ and $\hat{U}$:

$$\hat{U}^\dagger \hat{U} = \sum_j \sum_k \ket{e_j}\braket{f_j \vert f_k}\bra{e_k} = \sum_j \sum_k \delta_{jk}\,\ket{e_j}\bra{e_k}$$

Since $\delta_{jk}$ ensures that the sum is only nonzero when $j = k$, we can remove one of the sums:

$$\hat{U}^\dagger \hat{U} = \sum_j \ket{e_j}\bra{e_j} = \hat{I}$$

where we have used the completeness relation $\sum_j \ket{e_j}\bra{e_j} = \hat{I}$.
Therefore, $\hat{U}^\dagger \hat{U} = \hat{I}$, or equivalently $\hat{U}^\dagger = \hat{U}^{-1}$. We call such an operator a unitary operator.
Unitary operators will be very important when we discuss the time-evolution of quantum states.
We can summarize what we have found in the following theorem:
Given two sets of orthonormal and complete basis vectors $\{\ket{e_i}\}$ and $\{\ket{f_i}\}$, there always exists a unitary operator $\hat{U} = \sum_j \ket{f_j}\bra{e_j}$ that transforms one basis to another such that $\ket{f_i} = \hat{U}\ket{e_i}$.
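As a quick sanity check, we can verify this numerically. The following is a minimal sketch using NumPy, with two random orthonormal bases of $\mathbb{C}^3$ standing in for $\{\ket{e_i}\}$ and $\{\ket{f_i}\}$:

```python
# Sketch: build U = sum_j |f_j><e_j| for two orthonormal bases of C^3
# and check that it is unitary and maps |e_i> to |f_i>.
import numpy as np

rng = np.random.default_rng(0)
n = 3

def random_orthonormal_basis(n):
    """Return the columns of a random unitary matrix as a list of basis vectors."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, _ = np.linalg.qr(A)          # QR factorization gives orthonormal columns
    return [Q[:, i] for i in range(n)]

e = random_orthonormal_basis(n)     # old basis {|e_i>}
f = random_orthonormal_basis(n)     # new basis {|f_i>}

# U = sum_j |f_j><e_j|  (a sum of outer products)
U = sum(np.outer(f[j], e[j].conj()) for j in range(n))

print(np.allclose(U.conj().T @ U, np.eye(n)))                 # True: U†U = I
print(all(np.allclose(U @ e[i], f[i]) for i in range(n)))     # True: U|e_i> = |f_i>
```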
As always, we can represent operators as matrices, and the change of basis operator is no exception.
Recall that the $ij$-th element of the matrix representation of an operator $\hat{A}$ is $A_{ij} = \bra{e_i}\hat{A}\ket{e_j}$.
Therefore, the matrix representation of $\hat{U}$ is:

$$U_{ij} = \bra{e_i}\hat{U}\ket{e_j} = \braket{e_i \vert f_j}$$
It is insightful to see that this is the same as the transformation matrix we have seen in linear algebra, whose entries are $\vec{e}_i \cdot \vec{f}_j$, where $\{\vec{e}_i\}$ and $\{\vec{f}_i\}$ are two sets of orthonormal basis vectors.
In the case of curvilinear coordinates, we can use a combination of the metric tensor and the Jacobian to find the transformation matrix.
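For a concrete instance, consider the spin-$\tfrac{1}{2}$ bases from the introduction. Assuming the usual conventions $\ket{{\pm}x} = \tfrac{1}{\sqrt{2}}\left(\ket{{+}z} \pm \ket{{-}z}\right)$, a short sketch computes $U_{ij} = \braket{e_i \vert f_j}$ directly:

```python
# Sketch: U_ij = <e_i|f_j> with old basis {|+z>, |-z>} and new basis {|+x>, |-x>}.
import numpy as np

plus_z, minus_z = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus_x = (plus_z + minus_z) / np.sqrt(2)
minus_x = (plus_z - minus_z) / np.sqrt(2)

old = [plus_z, minus_z]
new = [plus_x, minus_x]
U = np.array([[old_i.conj() @ new_j for new_j in new] for old_i in old])
print(U)   # [[ 0.707  0.707], [ 0.707 -0.707]]: the familiar Hadamard-like matrix
```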
The next question we might ask is: how do the components of a vector change under a change of basis?
If one is familiar with tensors, one might recall that vectors have contravariant components; the components transform in the opposite way to the basis vectors.
It is easy to see why this is the case.
I borrow from Eigenchris on YouTube for this explanation.
Consider a vector $\vec{v}$ with components $v^1$, $v^2$, and $v^3$ in the standard basis $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$.
Then, $\vec{v}$ can be written as $\vec{v} = v^1 \vec{e}_1 + v^2 \vec{e}_2 + v^3 \vec{e}_3$.
The key insight is that we can write this as the product of a row vector and a column vector, where the row vector contains the basis vectors and the column vector contains the components:

$$\vec{v} = \begin{bmatrix} \vec{e}_1 & \vec{e}_2 & \vec{e}_3 \end{bmatrix} \begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix}$$

Next, suppose a new basis $\vec{f}_1$, $\vec{f}_2$, and $\vec{f}_3$ is given and represented by some matrix $U$.
Then, we can write the new basis row vector as the following:

$$\begin{bmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 \end{bmatrix} = \begin{bmatrix} \vec{e}_1 & \vec{e}_2 & \vec{e}_3 \end{bmatrix} U$$

We want to be able to write $\vec{v}$ in terms of the new basis, with some new components $\tilde{v}^1$, $\tilde{v}^2$, and $\tilde{v}^3$:

$$\vec{v} = \begin{bmatrix} \vec{f}_1 & \vec{f}_2 & \vec{f}_3 \end{bmatrix} \begin{bmatrix} \tilde{v}^1 \\ \tilde{v}^2 \\ \tilde{v}^3 \end{bmatrix}$$
The key insight is that we can start from the original expansion and add anything in the middle that is equal to the identity matrix.
For instance, we can write $I = U U^{-1}$:

$$\vec{v} = \begin{bmatrix} \vec{e}_1 & \vec{e}_2 & \vec{e}_3 \end{bmatrix} U U^{-1} \begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix}$$

As such, we can see that:

$$\vec{v} = \underbrace{\begin{bmatrix} \vec{e}_1 & \vec{e}_2 & \vec{e}_3 \end{bmatrix} U}_{\text{new basis}} \; \underbrace{U^{-1} \begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix}}_{\text{new components}}$$

But because the transformation matrix is unitary, applying $U^{-1}$ is the same as applying the adjoint of the transformation matrix.
Then:

$$\begin{bmatrix} \tilde{v}^1 \\ \tilde{v}^2 \\ \tilde{v}^3 \end{bmatrix} = U^\dagger \begin{bmatrix} v^1 \\ v^2 \\ v^3 \end{bmatrix}$$

The fact that basis vectors are covariant (transform with the forward transformation $U$) and components are contravariant (transform with the inverse transformation $U^{-1} = U^\dagger$) is why the vector itself is invariant under a change of basis.
Intuitively, it makes sense that the components and basis "cancel" each other out.
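A short numerical sketch of this cancellation, taking the old basis to be the standard basis of $\mathbb{R}^2$ and the new basis to be the columns of a rotation matrix $U$:

```python
# Sketch: components transform with U^{-1} = U†, the basis transforms with U,
# so the reassembled vector is unchanged.
import numpy as np

theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])        # a simple (real) unitary
f = [U[:, 0], U[:, 1]]                                 # new basis vectors |f_i> = U|e_i>

v_old = np.array([2.0, -1.0])                          # components in the old (standard) basis
v_new = U.conj().T @ v_old                             # components transform with U†

# Reassemble the vector as sum_i (component_i * basis_vector_i) in the new basis
print(np.allclose(v_old, v_new[0] * f[0] + v_new[1] * f[1]))   # True
```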
The components of operators also transform under a change of basis.
Suppose we have a state vector $\ket{\psi}$ and an operator $\hat{A}$.
The vector's components are $\psi_i = \braket{e_i \vert \psi}$ in the old basis $\{\ket{e_i}\}$ and $\tilde{\psi}_i = \braket{f_i \vert \psi}$ in the new basis $\{\ket{f_i}\}$.
Recall that operators act on state vectors to produce new state vectors.
As such, to act on $\ket{\psi}$ in the new basis $\{\ket{f_i}\}$, we can simply transform to the old basis with $U$, act with the operator matrix $A$, and then transform back to the new basis with $U^\dagger$.
Thus, the matrix of the operator in the new basis is equal to $U^\dagger A U$.
To see this more clearly, recall that $\ket{\psi}$ can be written as a row-vector times a column-vector:

$$\ket{\psi} = \begin{bmatrix} \ket{e_1} & \ket{e_2} & \cdots \end{bmatrix} \begin{bmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{bmatrix}$$

Performing an operator is equivalent to multiplying its matrix representation by the column-vector:

$$\hat{A}\ket{\psi} = \begin{bmatrix} \ket{e_1} & \ket{e_2} & \cdots \end{bmatrix} A \begin{bmatrix} \psi_1 \\ \psi_2 \\ \vdots \end{bmatrix}$$
As we know, the basis vectors transform with the forward transformation $U$ and the components transform with the inverse transformation $U^\dagger$.
As such, we can write that:

$$\hat{A}\ket{\psi} = \begin{bmatrix} \ket{f_1} & \ket{f_2} & \cdots \end{bmatrix} U^\dagger A U \begin{bmatrix} \tilde{\psi}_1 \\ \tilde{\psi}_2 \\ \vdots \end{bmatrix}$$

Notice that the part in the middle, $U^\dagger A U$, must be the matrix representation of the operator in the new basis.
The transformation $A \mapsto U^\dagger A U$ is called the similarity transformation of the operator $\hat{A}$.
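We can check this numerically as well. In the sketch below, the old basis is the standard basis, the new basis vectors are the columns of a random unitary $U$, and the matrix elements $\bra{f_i}\hat{A}\ket{f_j}$ are compared against $U^\dagger A U$:

```python
# Sketch: the matrix of A in the new basis equals U† A U.
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))        # operator's old-basis matrix
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
U = Q                                                              # unitary change of basis
f = [U[:, j] for j in range(n)]                                    # new basis |f_j> = U|e_j>

A_new_direct = np.array([[f[i].conj() @ A @ f[j] for j in range(n)] for i in range(n)])
A_new_similarity = U.conj().T @ A @ U
print(np.allclose(A_new_direct, A_new_similarity))                 # True
```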
Linear map transformations using tensors
In the language of multilinear algebra, a linear map is a $(1,1)$-tensor, meaning it has one contravariant index and one covariant index.
A multilinear map can generally be represented as $T = T^i{}_j \, \vec{e}_i \otimes \epsilon^j$.
$\vec{e}_i$ and $\epsilon^j$ are the basis vectors and linear functionals/covectors, corresponding to $\ket{e_i}$ and $\bra{e_j}$ in Dirac notation.
$\otimes$ is the tensor product, similar to the outer product that we have used.
All of this is just a different notation for what we have already seen.
However, the tensor notation makes it very easy to see how the components of a tensor transform under a change of basis.
Applying a change of basis to both bases in the expression for $T$ yields:

$$T = T^i{}_j \, \vec{e}_i \otimes \epsilon^j = T^i{}_j \left[ (U^{-1})^k{}_i \, \vec{f}_k \right] \otimes \left[ U^j{}_l \, \tilde{\epsilon}^l \right] = \left[ (U^{-1})^k{}_i \, T^i{}_j \, U^j{}_l \right] \vec{f}_k \otimes \tilde{\epsilon}^l$$

Thus we see that $\tilde{T}^k{}_l = (U^{-1})^k{}_i \, T^i{}_j \, U^j{}_l$, and hence $\tilde{T} = U^{-1} T U$.
This is the same as the similarity transformation we have seen before.
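The index form translates almost literally into `np.einsum`. A minimal sketch, using a real orthogonal matrix so that $U^{-1} = U^T$:

```python
# Sketch of the index gymnastics: T'^i_j = (U^{-1})^i_k T^k_l U^l_j via np.einsum.
import numpy as np

rng = np.random.default_rng(2)
T = rng.normal(size=(3, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
U, U_inv = Q, Q.T                                   # real orthogonal, so U^{-1} = U^T

T_new = np.einsum('ik,kl,lj->ij', U_inv, T, U)      # contraction over k and l
print(np.allclose(T_new, U_inv @ T @ U))            # True: same as the matrix product
```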
The trace of an operator is defined as the sum of the diagonal elements of the matrix representation of the operator.
In the context of quantum mechanics, the trace of an operator is very important because it is invariant under a change of basis.
Suppose we have an operator $\hat{A}$ with matrix representation $A_{ij} = \bra{e_i}\hat{A}\ket{e_j}$.
Then, the trace of the operator is:

$$\operatorname{Tr}(\hat{A}) = \sum_i A_{ii} = \sum_i \bra{e_i}\hat{A}\ket{e_i}$$

Now, suppose we change the basis from $\{\ket{e_i}\}$ to $\{\ket{f_i}\}$.
To show that the trace is invariant under a change of basis, we can use a few completeness relations:

$$\sum_i \bra{f_i}\hat{A}\ket{f_i} = \sum_i \sum_j \braket{f_i \vert e_j}\bra{e_j}\hat{A}\ket{f_i} = \sum_j \sum_i \bra{e_j}\hat{A}\ket{f_i}\braket{f_i \vert e_j} = \sum_j \bra{e_j}\hat{A}\ket{e_j}$$
Therefore, the trace of an operator is invariant under a change of basis.
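A one-line numerical check of this invariance, with a random matrix and a random unitary:

```python
# Sketch: Tr(A) is the same in the old basis and in the new basis U† A U.
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))

print(np.isclose(np.trace(A), np.trace(Q.conj().T @ A @ Q)))    # True
```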
For many operators of interest (in particular, Hermitian operators), there is a change of basis that makes the operator diagonal.
This is a very powerful tool because diagonal operators are very easy to work with.
Since the diagonal elements of a diagonal operator are the eigenvalues, we can diagonalize an operator by finding its eigenvalues and eigenvectors.
Given an operator $\hat{A}$, any eigenvector $\ket{\lambda}$ of $\hat{A}$ satisfies the eigenvalue equation $\hat{A}\ket{\lambda} = \lambda\ket{\lambda}$.
This can then be rewritten as $\hat{A}\ket{\lambda} - \lambda\ket{\lambda} = 0$, or $(\hat{A} - \lambda\hat{I})\ket{\lambda} = 0$.
In other words, the operator $\hat{A} - \lambda\hat{I}$ has a nontrivial null space, which is spanned by the eigenvector $\ket{\lambda}$.
For $\hat{A} - \lambda\hat{I}$ to have a nontrivial null space, it must be singular, which means that $\det(\hat{A} - \lambda\hat{I}) = 0$.
Of course, this is the characteristic equation of the operator.
The characteristic equation is a polynomial of degree $n$ (the dimension of the space) in $\lambda$.
If an operator has $n$ distinct eigenvalues, then it can be diagonalized by making the eigenvectors the basis vectors.
Otherwise, it may or may not be diagonalizable: with repeated eigenvalues, the operator is diagonalizable only if its eigenvectors still span the whole space.
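As a sketch, we can diagonalize an operator numerically by changing to its eigenbasis. A Hermitian matrix is used here, so the eigenvector matrix is unitary and the diagonalization is guaranteed to exist:

```python
# Sketch: diagonalize a Hermitian operator by changing to the basis of its eigenvectors.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (X + X.conj().T) / 2                      # Hermitian, hence diagonalizable

eigvals, V = np.linalg.eigh(A)                # columns of V are orthonormal eigenvectors
A_diag = V.conj().T @ A @ V                   # similarity transformation into the eigenbasis
print(np.allclose(A_diag, np.diag(eigvals)))  # True: diagonal, with eigenvalues on the diagonal
```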
Regarding diagonalization, we also have an important theorem:
Theorem: Let $\hat{A}$ be an operator with non-degenerate eigenvalues (i.e. each eigenvalue corresponds to only a single eigenvector).
If $\hat{A}$ and $\hat{B}$ commute, then $\hat{B}$ is diagonal in the basis of eigenvectors of $\hat{A}$.
Proof: We already know that $\hat{A}$ is diagonal in the basis of eigenvectors of $\hat{A}$ (by definition).
We also know that if $\hat{A}$ and $\hat{B}$ commute, then they share a common set of eigenvectors.
Thus, $\hat{B}$ is diagonal in the basis of eigenvectors of $\hat{A}$ because they are also the eigenvectors of $\hat{B}$.
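A quick numerical illustration of the theorem, where $\hat{B}$ is built as a polynomial in $\hat{A}$ so that the two commute by construction:

```python
# Sketch: if [A, B] = 0 and A has non-degenerate eigenvalues, B is diagonal in A's eigenbasis.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (X + X.conj().T) / 2                       # Hermitian, generically non-degenerate
B = A @ A + 2 * A + np.eye(3)                  # a polynomial in A, so it commutes with A

_, V = np.linalg.eigh(A)                       # eigenbasis of A
B_in_A_basis = V.conj().T @ B @ V
off_diagonal = B_in_A_basis - np.diag(np.diag(B_in_A_basis))
print(np.allclose(off_diagonal, 0))            # True: B is diagonal in A's eigenbasis
```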
The spectral decomposition of an operator is a way to write the operator as a sum of projectors onto the eigenvectors of the operator.
Given an operator $\hat{A}$ with eigenvectors $\ket{\lambda_i}$ and eigenvalues $\lambda_i$, we can write the operator as:

$$\hat{A} = \sum_i \lambda_i \ket{\lambda_i}\bra{\lambda_i}$$
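A short sketch that rebuilds a Hermitian operator from its spectral decomposition:

```python
# Sketch: reconstruct A from A = sum_i lambda_i |lambda_i><lambda_i|.
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (X + X.conj().T) / 2                       # Hermitian operator

eigvals, V = np.linalg.eigh(A)                 # eigenvalues and orthonormal eigenvectors
A_rebuilt = sum(eigvals[i] * np.outer(V[:, i], V[:, i].conj()) for i in range(3))
print(np.allclose(A, A_rebuilt))               # True
```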
Appendix: Proof of the Cyclic Property of the Trace
The cyclic property of the trace is a bit more complicated to prove.
We shall prove that $\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \operatorname{Tr}(\hat{C}\hat{A}\hat{B})$, and the proof for the other cyclic properties follows similarly.
We will use two different notations to prove this property: Dirac notation and tensor notation.
Eventually, once we start studying relativistic quantum mechanics, we will need a solid grasp of both notations, so it is good to be familiar with both.
First, start by expanding the trace of $\hat{A}\hat{B}\hat{C}$ and applying the completeness relation twice:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \sum_i \bra{e_i}\hat{A}\hat{B}\hat{C}\ket{e_i} = \sum_i \sum_j \sum_k \bra{e_i}\hat{A}\ket{e_j}\bra{e_j}\hat{B}\ket{e_k}\bra{e_k}\hat{C}\ket{e_i}$$

We can re-interpret the expression inside the sum as the product of three numbers: $\bra{e_i}\hat{A}\ket{e_j}$, $\bra{e_j}\hat{B}\ket{e_k}$, and $\bra{e_k}\hat{C}\ket{e_i}$.
Hence, because complex multiplication is commutative, we can rearrange the terms in the sum:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \sum_i \sum_j \sum_k \bra{e_k}\hat{C}\ket{e_i}\bra{e_i}\hat{A}\ket{e_j}\bra{e_j}\hat{B}\ket{e_k}$$

Now we re-interpret the expression in the sum back as a big matrix element with projection operators in the middle:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \sum_k \bra{e_k}\hat{C}\left(\sum_i \ket{e_i}\bra{e_i}\right)\hat{A}\left(\sum_j \ket{e_j}\bra{e_j}\right)\hat{B}\ket{e_k}$$

Now, because $\sum_i \ket{e_i}\bra{e_i} = \hat{I}$, we can remove the sums over $i$ and $j$, and remove the projection operators:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \sum_k \bra{e_k}\hat{C}\hat{A}\hat{B}\ket{e_k} = \operatorname{Tr}(\hat{C}\hat{A}\hat{B})$$
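A quick numerical check of this result with three random matrices:

```python
# Sketch: verify Tr(ABC) = Tr(CAB) numerically.
import numpy as np

rng = np.random.default_rng(7)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))
print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))   # True
```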
We can also prove the cyclic property using tensor notation.
Recall that the matrix representation of an operator is a $(1,1)$-tensor, so the matrix representation of the operator $\hat{A}\hat{B}\hat{C}$ is a contraction of three such tensors.
Let $A^i{}_j$ be the matrix representation of $\hat{A}$, $B^i{}_j$ be the matrix representation of $\hat{B}$, and $C^i{}_j$ be the matrix representation of $\hat{C}$.
Then, the $ij$-th element of $ABC$ is $A^i{}_k \, B^k{}_l \, C^l{}_j$, with an implied sum over $k$ and $l$.
We know that this is the specific representation of $\hat{A}\hat{B}\hat{C}$ in that order from a heuristic: notice that the indices connect diagonally across the three matrices:

$$(ABC)^i{}_j = A^i{}_{k} \, B^{k}{}_{l} \, C^{l}{}_j$$
The indices that are not paired become the indices of the resulting matrix.
The cyclic property of the trace is then:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \operatorname{Tr}(\hat{C}\hat{A}\hat{B})$$

The trace of $\hat{A}\hat{B}\hat{C}$ is then:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = (ABC)^i{}_i = A^i{}_j \, B^j{}_k \, C^k{}_i$$

Notice that we can reorder these terms on the right while retaining the diagonal connection of the indices:

$$A^i{}_j \, B^j{}_k \, C^k{}_i = C^k{}_i \, A^i{}_j \, B^j{}_k = (CAB)^k{}_k$$

Thus we have shown that $\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = \operatorname{Tr}(\hat{C}\hat{A}\hat{B})$.
The advantage of tensor notation is that it is very easy to see how the indices connect across the matrices, and how the cyclic property of the trace is a direct consequence of the connection of the indices.
Einstein's summation convention also helps to make everything more concise.
In fact, we can write this entire proof in a single line:

$$\operatorname{Tr}(\hat{A}\hat{B}\hat{C}) = A^i{}_j \, B^j{}_k \, C^k{}_i = C^k{}_i \, A^i{}_j \, B^j{}_k = \operatorname{Tr}(\hat{C}\hat{A}\hat{B})$$
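For instance, in NumPy the same one-line argument reads:

```python
# Sketch: both einsum contractions are the same sum A^i_j B^j_k C^k_i,
# just with the factors written in a different order.
import numpy as np

rng = np.random.default_rng(8)
A, B, C = (rng.normal(size=(3, 3)) for _ in range(3))
tr_ABC = np.einsum('ij,jk,ki->', A, B, C)
tr_CAB = np.einsum('ki,ij,jk->', C, A, B)     # same contraction, reordered factors
print(np.isclose(tr_ABC, tr_CAB), np.isclose(tr_ABC, np.trace(A @ B @ C)))   # True True
```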